Isolated cocoon silk protein from Simulium vittatum and nucleic acids encoding such protein

ABSTRACT

An isolated nucleic acid molecule encoding the cocoon silk protein from the black fly, Simulium vittatum. Also provided are the amino acid sequence derived from the cocoon silk, primers used to screen cDNA libraries and to promote the building of a complementary strand of DNA encoding the cocoon silk protein, a transformed microorganism containing cDNA which codes for the cocoon silk protein, the amino acid sequence translated from the isolated gene of the cocoon silk (deduced from the nucleotide sequence), and primers for constructing a segment of recombinant DNA.

FIELD OF THE INVENTION

[0001] The present invention relates to the cocoon silk protein isolated from the black fly, Simulium vittatum.

BACKGROUND OF THE INVENTION

[0002] Silk is a natural protein filament fiber. Several types of natural silk that are known to date are secreted by invertebrates such as those that belong to two classes of the phylum Arthropoda: Insecta and Arachnida. Silk producing insects include silk worms, black flies, wasps, and lacewing flies.

[0003] Some arthropod silks have now been cloned. For example, Lewis, R. V. et al. (U.S. Pat. No. 5,728,810) teach the preparation of spider silk protein by recombinant DNA techniques. Lewis, R. V. et al. (U.S. Pat. No. 5,733,771) teach a cDNA encoding minor ampullate silk proteins. Lombardi, S. J. et al. (U.S. Pat. No. 5,245,012) teach a recombinant spider silk protein which can be obtained in a commercially useful form by the cloning of host cells encoding such protein.

[0004] Another silk producing arthropod, the black fly, evolved to produce a very durable silk filament. Silk is produced by the black fly larva, which forms a cocoon. The larvae and pupae are aquatic but are confined to running waters where they attach themselves to firm substrates. The black fly's silk filament is able to withstand exposure to water flow in order to keep the pupa inside the cocoon intact. Another remarkable property of the Simuliidae silk is its ability to maintain its adhesive characteristic while submerged in water. These properties are very attractive in terms of possible application of the black fly silk as a biomaterial.

[0005] The above prior art references are incorporated herein by reference.

[0006] The prior art does not teach the isolation of a nucleic acid molecule coding for the silk protein from black flies. Further, the prior art does not teach the expression of such silk protein using recombinant DNA techniques.

SUMMARY OF THE INVENTION

[0007] In one embodiment, the present invention provides an isolated polypeptide molecule having an amino acid sequence of SEQ ID NO: 1. In a further embodiment, the present invention provides an isolated nucleic acid molecule coding for such polypeptide. In another embodiment, the present invention provides the nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 6. In a further embodiment, the present invention provides a polypeptide molecule having an amino acid sequence of SEQ ID NO: 7 and expressed by such nucleic acid molecule. In yet another embodiment, the present invention provides an isolated nucleic acid molecule coding for such polypeptide. In addition, the present invention provides a cloning vector comprising the nucleotide sequence of SEQ ID NO: 6 and a host cell transformed with such vector. In a further embodiment, the present invention provides a fiber formed from the polypeptide of SEQ ID NO: 7. In yet another embodiment, the present invention provides a method of isolating cocoon silk protein comprising the steps of: a) boiling a cocoon in a sample reducing buffer to remove SDS-soluble proteins; b) centrifuging the sample; c) withdrawing the supernatant, adding formic acid to the pellet and incubating the sample in order to solubilize SDS-insoluble proteins; d) freezing and lyophilizing the sample in order to freeze-dry the sample; and e) re-suspending the dried sample in TEPI to protect proteins against potential residual proteolytic activity for subsequent analysis using SDS-PAGE.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0008] In a preferred embodiment, the present invention provides an isolated nucleic acid molecule encoding the silk protein of black fly cocoons. The invention also provides amino acid molecules expressed by such nucleic acid as well as cells transformed with such nucleic acid molecule. Also provided are primers for screening DNA libraries for the DNA encoding the subject silk protein.

[0009] Methods and Results of Research

[0010] 1. Method of Isolating and Purifying Cocoon Silk Protein From Simulium vittatum

[0011] The following method was developed for isolating and purifying cocoon silk protein from the black fly, Simulium vittatum. In the preferred embodiment, the method comprises the following procedure. A single cocoon from S. vittatum was boiled in 500 μL sample reducing buffer for one minute and then centrifuged for one minute at 13,000 rpm. Boiling in sample reducing buffer removes any SDS-soluble proteins. The composition of sample reducing buffer is as follows:

[0012] 1 mL 0.5M Tris-HCl, pH 6.8

[0013] 0.8 mL glycerol

[0014]-[0015] 1.6 mL 10% SDS (sodium dodecyl sulfate)

[0016] 0.4 mL 2-mercaptoethanol (β-mercaptoethanol)

[0017] 0.2 mL 0.5% bromophenol blue

[0018] 4 mL H₂O
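As a rough cross-check of the recipe above, the final concentration of each component in the mixed buffer can be worked out from the listed volumes. The sketch below assumes that glycerol and 2-mercaptoethanol are added as neat (100%) liquids, which the text does not state; the variable names are illustrative only.

```python
# Sample reducing buffer: name -> (stock concentration, unit, volume in mL).
# Assumption: glycerol and 2-mercaptoethanol are neat (100 % v/v) stocks.
components = {
    "Tris-HCl, pH 6.8": (0.5, "M", 1.0),
    "glycerol": (100.0, "% v/v", 0.8),
    "SDS": (10.0, "% w/v", 1.6),
    "2-mercaptoethanol": (100.0, "% v/v", 0.4),
    "bromophenol blue": (0.5, "% w/v", 0.2),
}
water_ml = 4.0

# Total volume of the mixed buffer.
total_ml = sum(vol for _, _, vol in components.values()) + water_ml

# Final concentration = stock concentration x (component volume / total volume).
final = {
    name: (stock * vol / total_ml, unit)
    for name, (stock, unit, vol) in components.items()
}

print(f"total volume: {total_ml:g} mL")
for name, (conc, unit) in final.items():
    print(f"{name}: {conc:g} {unit}")
```

Under these assumptions the mixture comes to 8 mL, with 62.5 mM Tris-HCl and 2% SDS.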

[0019] The supernatant was then removed and the sample was washed 4 times with dH₂O by spinning the sample down and pouring off water between washes. Then the pellet was re-suspended in 500 μL of 90% formic acid and the sample was incubated in a shaker at 22° C. for 1 hour in order to solubilize the SDS-insoluble proteins. The sample was then frozen and lyophilized (freeze-dried): the sample (formic acid and cocoon) was transferred to a 50 mL centrifuge tube and the volume was increased to 50 mL by the addition of dH₂O. This centrifuge tube was then frozen in a −70° C. freezer and the sample was lyophilized. After that, the sample was re-suspended in 400 μL of TEPI. TEPI buffer contains:

[0020] 10 mM Tris-HCl, pH 8.0

[0021] 1 mM EDTA (ethylenediaminetetraacetic acid)

[0022] 1 μM phenylmethylsulfonyl fluoride (PMSF)

[0023] 100 μM iodoacetamide

[0024] Re-suspension in TEPI protected proteins against potential residual proteolytic activity for subsequent analysis using SDS polyacrylamide gel electrophoresis (SDS-PAGE). The sample was then run on an SDS-polyacrylamide gel in duplicate using standard procedures outlined in Laemmli (Cleavage of Structural Proteins During the Assembly of the Head of Bacteriophage T4, Nature, 1970, 227:680-685, the content of which we incorporate herein by reference). One gel was silver stained and the other was transferred to a polyvinylidene difluoride (PVDF) membrane which was stained with Ponceau stain. The band on the gel that corresponded to the cocoon silk protein of S. vittatum was excised using a razor blade and sent to the Centre de Recherche du CHUL (Quebec, Canada) for N-terminal amino acid sequencing.

[0025] 2. N-Terminal Amino Acid Sequence For Black Fly Cocoon Silk

[0026] The N-terminal amino acid sequencing of the silk protein isolated above revealed the following sequence:

[0027] GVAPKKYRKGHYVGGYGKKY (SEQ ID NO: 1)

[0028] 3. cDNA Construction

[0029] In the preferred embodiment, cDNA was constructed as follows. Salivary glands were dissected from 10 S. vittatum larvae and placed into an RNase-free Eppendorf tube, on ice. After that, 1 mL of TRIZOL™ reagent (Life Technologies Inc.) was added. Total RNA was recovered using the manufacturer's instructions.

[0030] Poly A⁺ mRNA was then isolated from the total RNA using Qiagen's Oligotex™ mRNA Kit. Oligotex provides a hybridization carrier on which nucleic acids containing polyadenylic acid sequences can be simply and efficiently immobilized and easily recovered. Briefly, the Oligotex procedure for isolation and purification of poly A⁺ mRNA takes advantage of the fact that most eukaryotic mRNAs end in a homopolymer of 20-250 adenosine nucleotides, known as the poly A tail. The poly A tail is added to the RNA transcript in the nucleus following transcription. In contrast, structural RNAs are not polyadenylated. Nuclear polyadenylation of mRNAs performed by the eukaryotic cell provides molecular biologists with a useful tool for separation or selective isolation of poly A⁺ mRNAs from total cellular RNA. Separation of poly A⁺ mRNAs from rRNA and tRNA can be achieved by hybridizing the polyadenylated tails of mRNA molecules to oligo dT primers which are coupled to a solid phase matrix. RNA species lacking poly A (rRNA and tRNA) fail to bind to oligo dT and are removed. Since high salt conditions are necessary to allow hybridization, the poly A⁺ mRNA can subsequently be released by lowering the ionic strength and destabilizing the dT:A hybrids.

[0031] Following poly A⁺ mRNA isolation, a cDNA library was constructed using RT-PCR (reverse transcription-polymerase chain reaction) following the Omniscript Protocol for Reverse Transcription (Omniscript Reverse Transcriptase Handbook, 1999, the content of which we incorporate herein by reference). Reverse transcriptase is a multifunctional enzyme with several distinct enzymatic activities, two of which, an RNA-dependent DNA polymerase and a hybrid-dependent exoribonuclease (RNase H), are utilized for reverse transcription in vitro to produce single-stranded cDNA with RNA as a starting template. The RNA-dependent DNA polymerase activity (reverse transcription) transcribes cDNA from an RNA template, which allows synthesis of cDNA for subsequent PCR. The exoribonuclease activity (RNase H) of Omniscript Reverse Transcriptase specifically degrades only the RNA in RNA:DNA hybrids. This Omniscript RNase H activity affects RNA that is hybridized to cDNA and also improves the sensitivity of subsequent PCR.

[0032] The reverse-transcription (RT) reaction conditions were as follows:

10X Buffer RT: 2.0 μL
dNTP mix (5 mM each dNTP): 2.0 μL
Oligo-dT primer (SEQ ID NO: 3), 10 μM: 2.0 μL
RNase inhibitor (10 units/μL): 1.0 μL
Omniscript Reverse Transcriptase (4 units/μL): 1.0 μL
RNase-free water: 9.0 μL
Template poly A⁺ RNA (~25 ng/μL): 3.0 μL
Total: 20 μL

[0033] 4. 60-Nucleotide Primer Used to Screen cDNA Library For Cocoon Silk Protein Transcript.

[0034] Two primers are preferably used to promote the building of a new strand of DNA encoding the cocoon silk protein after the DNA strands are separated by heating during the PCR process.

[0035] Primer #1, the cocoon silk protein primer, was a degenerate primer of the following structure: 5′ GGN GTN GCN CCN AAN AAN TAN CGN AAN GGN CAN TAN GTN GGN GGN TAN GGN AAN AAN TAN 3′ (SEQ ID NO: 2)

[0036] Primer #2 was a poly-T primer of the following structure: 5′-TTTTGTACAAGCTT₃₀N₂-3′ (SEQ ID NO: 3), where N can be any of A, T, G or C.
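The structure of Primer #1 can be reproduced from the N-terminal sequence of SEQ ID NO: 1 by writing the first two bases of a codon for each residue and leaving the third position as N. The sketch below is only an illustration of that pattern. The `CODON_PREFIX` table and the function name are hypothetical helpers and are not part of the described method.

```python
# Two-base codon prefix used for each residue in Primer #1; the third base is
# written as N (any of A, C, G, T). Only residues present in SEQ ID NO: 1 are
# listed. Illustrative helper, not part of the described method.
CODON_PREFIX = {
    "G": "GG", "V": "GT", "A": "GC", "P": "CC",
    "K": "AA", "Y": "TA", "R": "CG", "H": "CA",
}

def degenerate_primer(peptide: str) -> str:
    """Return the degenerate primer as space-separated triplets."""
    return " ".join(CODON_PREFIX[aa] + "N" for aa in peptide)

n_terminal = "GVAPKKYRKGHYVGGYGKKY"  # SEQ ID NO: 1
primer = degenerate_primer(n_terminal)
primer_length = len(primer.replace(" ", ""))

# Each N stands for four possible bases, so the pool contains 4**20 sequences.
n_variants = 4 ** primer.count("N")

print(primer)
print(primer_length, "nucleotides;", n_variants, "possible sequences")
```

Note that a fully degenerate third position is broader than the genetic code requires for some residues: for example, AAN also matches the asparagine codons AAT and AAC, and TAN also matches the stop codons TAA and TAG.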

[0037] The conditions of the polymerase chain reaction were as follows:

[0038]

[0039] 1. The PCR mixture, using the Qiagen kit, Catalogue No. 201203, consisted of:

Q-solution 10X: 4 μL
10X PCR Buffer (with 15 mM MgCl₂): 2 μL
dNTPs solution containing 10 mM of each dNTP: 2 μL
MgCl₂, 25 mM: 1 μL
10 μM Oligo-dT primer (SEQ ID NO: 3): 1 μL
85 pmoles/μL cocoon silk protein primer (SEQ ID NO: 2): 0.4 μL
Taq polymerase (5 units/μL): 0.2 μL
Template (finished RT product, ~25 ng/μL): 4 μL
dH₂O: 5.4 μL
Total: 20 μL

[0040] For PCR following RT, Omniscript recommends that no more than ⅕ of the total reaction volume be derived from the finished RT product. The maximum recommended was used, i.e. 4 μL of 20 μL.
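The volumes of the PCR mixture can be checked against the stated 20 μL total and the ⅕ template limit with a few lines of arithmetic. The sketch below assumes that the 15 mM MgCl₂ figure refers to the concentration in the 10X buffer stock, so that the buffer and the added 25 mM MgCl₂ both contribute to the final magnesium concentration; the dictionary keys are illustrative labels.

```python
# PCR mixture volumes in microlitres, as listed for the 20 uL reaction.
pcr_mix_ul = {
    "Q-solution": 4.0,
    "10X PCR buffer": 2.0,
    "dNTP solution": 2.0,
    "MgCl2 25 mM": 1.0,
    "oligo-dT primer 10 uM": 1.0,
    "silk primer 85 pmol/uL": 0.4,
    "Taq polymerase 5 U/uL": 0.2,
    "template": 4.0,
    "dH2O": 5.4,
}

total_ul = sum(pcr_mix_ul.values())

# Fraction of the reaction derived from the finished RT product (limit: 1/5).
template_fraction = pcr_mix_ul["template"] / total_ul

# Assumption: the 10X buffer stock contains 15 mM MgCl2.
mg_final_mM = (15.0 * pcr_mix_ul["10X PCR buffer"]
               + 25.0 * pcr_mix_ul["MgCl2 25 mM"]) / total_ul

# 85 pmol/uL x 0.4 uL = 34 pmol; pmol per uL of reaction equals uM.
silk_primer_pmol = 85.0 * pcr_mix_ul["silk primer 85 pmol/uL"]
silk_primer_uM = silk_primer_pmol / total_ul

print(f"total: {total_ul:.1f} uL, template fraction: {template_fraction:.2f}")
print(f"MgCl2: {mg_final_mM:.2f} mM, silk primer: {silk_primer_uM:.2f} uM")
```

Under that assumption the reaction contains about 2.75 mM MgCl₂ and 1.7 μM of the degenerate primer.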

[0041] 2. The thermocycler program was as follows:

1) 95° C., 15 min
2) 94° C., 2 min 30 sec
3) 55° C., 3 min
4) 72° C., 2 min 30 sec
5) 72° C., 5 min (final extension)

[0042] Steps 2-4 were run for 45 cycles. The sample was then run on an ethidium bromide gel and a single band <750 bp was visualized.
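For planning purposes, the programmed hold time of this thermocycler program can be totalled as shown in the sketch below. It counts only the listed hold times and ignores temperature ramping between steps, which depends on the instrument.

```python
# Hold times in seconds for the thermocycler program described above.
initial_denaturation_s = 15 * 60
cycle_steps_s = [
    2 * 60 + 30,  # step 2: 94 C
    3 * 60,       # step 3: 55 C
    2 * 60 + 30,  # step 4: 72 C
]
n_cycles = 45
final_extension_s = 5 * 60

# Total programmed time, excluding ramping between temperatures.
total_seconds = (initial_denaturation_s
                 + n_cycles * sum(cycle_steps_s)
                 + final_extension_s)
hours, remainder = divmod(total_seconds, 3600)
minutes = remainder // 60

print(f"programmed hold time: {hours} h {minutes} min")
```

The listed steps add up to 6 hours 20 minutes of hold time.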

[0043] 5. Ligation of RT-PCR Product Using pGEM-T™ Vector System From Promega

[0044] The RT-PCR product of step 4 was then ligated, preferably using the pGEM-T™ Easy Vector System from Promega (Cat. No. A3600). The resultant DNA from the RT-PCR reaction was purified using a GFX™ PCR DNA and Gel Band Purification kit (Amersham Pharmacia Biotech, Cat. No. 27-9602-01) according to the manufacturer's instructions and eluted in 40 μL dH₂O. The above purification removes salts, enzyme, unincorporated nucleotides and primers from PCR products. The resulting concentration of RT-PCR DNA was approximately 20 ng/μL. This purified RT-PCR DNA, approximately 0.7 kb in length, was then used as an insert for ligation into a pGEM-T™ Vector plasmid following the steps in “The Experienced User's Protocol for Promega pGEM-T™ Vector Systems”, the content of which is incorporated herein by reference. The ligation mixture used was as follows:

2X Rapid Ligation Buffer, T4 DNA ligase: 5 μL
pGEM-T vector (50 ng): 1 μL
purified RT-PCR DNA (20 ng/μL): 3 μL
T4 DNA ligase (3 Weiss units/μL): 1 μL
Total: 10 μL
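The insert-to-vector molar ratio implied by this ligation mixture can be estimated from the masses and lengths of the two DNA species. The sketch below assumes a vector length of about 3.0 kb, a figure that is not given in the text and is used here only for illustration.

```python
# Ligation inputs taken from the mixture above.
insert_ng = 3.0 * 20.0   # 3 uL of purified RT-PCR DNA at ~20 ng/uL
insert_kb = 0.7          # approximate insert length
vector_ng = 50.0         # pGEM-T vector added

# Assumption (not stated in the text): the vector is about 3.0 kb long.
vector_kb = 3.0

# For double-stranded DNA, molar amount is proportional to mass / length.
molar_ratio = (insert_ng / insert_kb) / (vector_ng / vector_kb)

print(f"insert:vector molar ratio is about {molar_ratio:.1f}:1")
```

With these inputs the insert is present at roughly a five-fold molar excess over the vector.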

[0045] 6. Transformation of E. coli XL1 Blue cells

[0046] E. coli XL1 Blue cells were transformed with the ligation mixture of step 5 as follows. E. coli XL1 Blue cells (Stratagene) were made competent, i.e. those cells were treated to enhance their ability to take up DNA. The protocol to make cells competent was modified from Sambrook, J., Fritsch, E. F., and Maniatis, T., 1989, Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press, the content of which we incorporate herein by reference. The actual procedure for making E. coli XL1 Blue cells (Stratagene) competent was as follows.

[0047] E. coli strain XL1-Blue cells were grown for 18 hours in 5 mL of LB broth at 37° C. and 250 rpm shaking (LB is Luria-Bertani Medium (pH 7.0) containing 2 g bacto-tryptone, 1 g bacto-yeast extract, 2 g NaCl in 200 mL dH₂O). Then, 200 μL of the above culture of E. coli cells was transferred into 50 mL of new LB broth and grown for 3 hours at 37° C. and 250 rpm shaking. After that, the mixture was centrifuged at 7,500 rpm for 3 minutes and the supernatant was discarded. The cells were then re-suspended in 5 mL of Buffer A. The composition of Buffer A was as follows: 100 mM NaCl, 5 mM MgCl₂, 5 mM Tris-HCl, pH 7.5. The re-suspended E. coli cells were incubated on ice for 10 minutes and centrifuged at 7,500 rpm for 3 minutes. After that, the supernatant was discarded and the residue was re-suspended in 5 mL of Buffer B. The composition of Buffer B was as follows: 100 mM CaCl₂, 5 mM MgCl₂, 5 mM Tris-HCl, pH 7.5. The resulting mixture with E. coli cells was incubated on ice for 30 minutes and the cells became competent. 10 μL of the ligation mixture (step 5) was added to 190 μL of the competent cells. The ligation mixture with the competent cells was incubated on ice for 1 hour, then subjected to a heat shock at 42° C. for 90 seconds, and then again incubated on ice for 5 minutes. After that, 1 mL of LB broth was added and the E. coli cells were grown at 37° C. and 250 rpm shaking for one hour.

[0048] The resulting transformed cells were plated onto LB/amp/IPTG/Xgal plates. These plates are Luria-Bertani Medium containing 1.5% agar and 75 μL/mL ampicillin, with each agar plate subsequently overlaid with 20 μL of a 100 mM solution of isopropyl-thio-beta-D-galactopyranoside (IPTG) in water and 50 μL of a 2% solution of 5-bromo-4-chloro-3-indolyl-beta-D-galactopyranoside (Xgal) in dimethyl sulfoxide. This agar medium is referred to as LB/amp/IPTG/Xgal. After that, E. coli colonies were screened to determine which colonies contained plasmids with the desired DNA insert. The screening is based on E. coli color change. E. coli that have been transformed with the plasmid that had the insert from the RT-PCR of step 3 and subsequent PCR of step 4 would be white. Those E. coli colonies that have been transformed with a plasmid that did not contain the desired insert would be blue. Several of the white colonies were tested to make sure that they did, in fact, contain the DNA insert in question.

[0049] 7. Plasmid Preparation For Nucleotide Sequencing

[0050] To obtain the nucleotide sequence of the cocoon silk protein, plasmids were first prepared for sequencing by selecting white colonies from the cells of step 6, growing them overnight, and then putting them through the Bio-Rad plasmid mini-prep kit (Cat. No. 732-6100). Eag I (New England BioLabs, Cat. No. 505S) was preferably used as the restriction enzyme to digest the plasmid to screen for an insert approximately 600-700 bp in length.

[0051] 8. Nucleotide Sequence

[0052] T7 and SP6 primers were preferably used to sequence the insert of step 7. These primers were provided by the sequencing facility. Sequences for the above primers are as follows:

[0053] Primer #3, T7 primer: 5′ TAATACGACTCACTATAGGGCGA 3′ (SEQ ID NO: 4)

Primer #4, SP6 primer: 5′ ATTTAGGTGACACTATAGAATAC 3′ (SEQ ID NO: 5)

The following nucleotide sequence of the insert of step 7 was derived (SEQ ID NO: 6), shown from the 5′ end:

AG CTC TCC CAT ATG GTC GAC CTG CAG GCG GCC GCA CTA GTG ATT
GGA GTT GCT CCA AAG AAG TAC CGC AAG GGA CAC TAT
GTC GGG GGT TAC GGG AAG AAG TAT CGT ATT TTT GAC
AGC AAT TGT GCT ATG AAC AAC GCC AAC TGT CAG AAT
CCA AAC GAA TCC GCC TTC GCC GAA GTT GAT TTC ACG
CTG TGC AAT GAT ATC AAA TGT CCT AGG AAA TGC GAT
AAA AAA CTA GAC CCG GTT TGT GCT TTT GAT GGG AAA
ACG TAC AGA CAA TTT AAC AAC AAA TGT CTG CTG CAA
GAA TTC AAT GAT TGC GAT CAA AAT GTG TTT CAA TAT
TTC AAC GCT GTG ACT AAC AAA AAA ATG TGC GTG GTT
GAG AAG CCA AAA TGC CCG ACC ATT TGT CCA GCA ATT
TAT GCT CCC GTT TGT GGT CGA AAT GCC AAA GGG GAT
TAC AAA AGT TTT GCG AGT GAA TGC AAC CAA TCC GCA
TTC AAC TGC TTG ATT TCT AAG AAT CAA TAT ACG GGC
AAG TAT GAT TTG AGT TTT TGC GAC ATC GAG TTC CCT TAA GCA TGA CGT TGT AAC
GTT TTT TCT CTG GAT GTG CAA AAC ATA AAT TAC AAG CAC TGG ATT GAA TGG TGT
TTT ATT AAA TTT CCT TGT GAC CTT TTT TCC ATT ATT CTT TCC GGC CTT TAA CAA
GTA ATC AAT ATT GAT ATC GGT CGT TTT TGT AAA GAT TTT TTT TCA GTA AAA ATA
TCC ATC TCA TTT TCA CAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAG CTT GTA
CAA AAA ATC CCG CGG CCA TGG CGG CCG GGA GCA TGC GAC GTC GGG CCC A

[0054] The underlined section of the above sequence corresponds to the deduced reading frame of the black fly cocoon silk protein gene. In general, the deduced reading frame is the codon sequence that is determined by reading nucleotides in groups of three, starting from a specific start codon. In this case, the initial amino acid sequence was determined from the N-terminal portion of the protein and this sequence then corresponded to the nucleotide sequence when read in triplets (codons).
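Reading the insert in triplets can be illustrated with a short translation routine. The sketch below builds the standard genetic code table, translates the first twenty codons of the reading frame as they appear in SEQ ID NO: 6, and compares the result with the N-terminal sequence of SEQ ID NO: 1. The function and variable names are illustrative and are not part of the described method.

```python
from itertools import product

# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str) -> str:
    """Translate DNA read in triplets, stopping at the first stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends the reading frame
            break
        protein.append(aa)
    return "".join(protein)

# First 20 codons of the reading frame, copied from SEQ ID NO: 6.
frame_start = ("GGA GTT GCT CCA AAG AAG TAC CGC AAG GGA CAC TAT "
               "GTC GGG GGT TAC GGG AAG AAG TAT")
n_terminal = "GVAPKKYRKGHYVGGYGKKY"  # SEQ ID NO: 1

translated_start = translate(frame_start)
print(translated_start, translated_start == n_terminal)

# The last codons of the frame run into the stop codon TAA.
frame_end = translate("GAG TTC CCT TAA GCA TGA")
print(frame_end)
```

The translated triplets reproduce the twenty residues obtained by N-terminal sequencing, which is the correspondence the paragraph above relies on.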

[0055] 9. Complete Amino Acid Sequence For Cocoon Silk Protein (Deduced From Nucleotide Sequence)

[0056] The DNA sequence of step 8 (SEQ ID NO: 6) was assessed for stop codons and the encoded amino acid sequence was deduced using all of the underlined nucleotides as shown in SEQ ID NO: 6. The amino acid sequence was deduced to be as follows (SEQ ID NO: 7):

GVAPKKYRKGHYVGGYGKKYRIFDSNCAMNNANCQNPNESAFAEVDFTLCNDIKCPRKCDKKLDPVCAFDGKTYRQFNNKCLLQEFNDCDQNVFQYFNAVTNKKMCVVEKPKCPTICPAIYAPVCGRNAKGDYKSFASECNQSAFNCLNSKNQYTGKYDLSFCDIEFP

[0057] Due to the redundancy of the genetic code, i.e. more than one nucleotide triplet (codon) can code for a single amino acid, more than one nucleotide sequence can potentially code for cocoon silk protein. Therefore, various other homologues can code for cocoon silk protein. Homology refers to sequence similarity between two peptides or between two nucleic acid molecules. Homology is determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base or amino acid, then the molecules are homologous at that position.
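The extent of this redundancy can be illustrated by counting how many distinct nucleotide sequences encode a given peptide under the standard genetic code, and the position-by-position comparison can be illustrated with a simple identity score. The sketch below is a minimal illustration; the function names are hypothetical, and the identity score assumes two sequences of equal length that are already aligned.

```python
from collections import Counter

# Standard genetic code as a 64-letter string (codons ordered TTT, TTC, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_PER_AA = Counter(AMINO)  # e.g. G -> 4 codons, K -> 2, R -> 6

def count_encodings(peptide: str) -> int:
    """Number of distinct nucleotide sequences that encode the peptide."""
    n = 1
    for aa in peptide:
        n *= CODONS_PER_AA[aa]
    return n

def percent_identity(a: str, b: str) -> float:
    """Percentage of positions with the same residue or base.

    Assumes the two sequences are already aligned and of equal length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

n_terminal = "GVAPKKYRKGHYVGGYGKKY"  # SEQ ID NO: 1
n_encodings = count_encodings(n_terminal)
print(n_encodings)

# Hypothetical six-residue comparison differing at the last position.
example_identity = percent_identity("GVAPKK", "GVAPKR")
print(f"{example_identity:.1f}% identical")
```

Even the twenty residues of SEQ ID NO: 1 can be encoded by more than a billion different nucleotide sequences, which is why a single published sequence does not exhaust the nucleic acids that code for the protein.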

[0058] Although the invention has been described with reference to certain specific embodiments, various modifications thereof will be apparent to those skilled in the art without departing from the spirit and scope of the invention as outlined in the claims appended hereto.

The embodiments of the invention in which an exclusive property or privilege is claimed are defined as follows:
 1. An isolated polypeptide molecule having an amino acid sequence of SEQ ID NO: 1 or a homolog thereof.
 2. An isolated nucleic acid molecule coding for the polypeptide of claim 1.
 3. A nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 6 or a homolog thereof.
 4. A polypeptide molecule having an amino acid sequence of SEQ ID NO: 7 and expressed by the nucleic acid molecule of claim 3.
 5. The isolated nucleic acid molecule coding for the polypeptide of claim 4.
 6. A cloning vector comprising the nucleotide sequence of SEQ ID NO: 6 or a homolog thereof.
 7. A host cell transformed with the vector of claim 6.
 8. A fiber formed from the polypeptide of claim 4.
 9. A method of isolating cocoon silk protein comprising the steps of: a) boiling a cocoon in a sample reducing buffer to remove SDS-soluble proteins; b) centrifuging the sample; c) withdrawing supernatant, adding formic acid to the pellet and incubating the sample in order to solubilize SDS-insoluble proteins; d) freezing and lyophilizing the sample in order to freeze-dry the sample; and e) re-suspending the dried sample in TEPI to protect the proteins against potential residual proteolytic activity for subsequent analysis using SDS-PAGE.